SWAGOLX.EXE (c) 1993 GDSOFT ALL RIGHTS RESERVED

1  08-24-94 13:19  ALL  EDWIN GROOTHUIS  Identify Archive Formats  SWAG9408

{ requires the Objects unit (PBufStream, stOpenRead, stOk) }

{$define ARJ}
{$define ZIP}
{$define ARC}
{$define LZH}
{$define ZOO}

function IdentifyArchive(const Name:string):char;
{
  returns:
    '?': unknown archive
    'A': Arj-archive
    'Z': Zip-archive
    'L': Lzh-archive
    'C': Arc-archive
    'O': Zoo-archive
}
var f:PBufStream;
    a:array[0..10] of char;
    bc:word;
    s:string;
begin
  IdentifyArchive:='?';
  if Name='' then
    exit;

  f:=New(PBufStream,Init(Name,stOpenRead,1024));
  if f^.Status<>stOk then
  begin
    Dispose(f,Done);
    exit;
  end;

  { read the first 11 bytes; a shorter file leaves Status<>stOk }
  f^.Read(a,sizeof(a));
  if f^.Status<>stOk then
  begin
    Dispose(f,Done);
    exit;
  end;
  Dispose(f,Done);

{$ifdef arj}
  if (a[0]=#$60) and (a[1]=#$EA) then
  begin
    IdentifyArchive:='A';       { ARJ }
    exit;
  end;
{$endif}

{$ifdef zip}
  if (a[0]='P') and (a[1]='K') then
  begin
    IdentifyArchive:='Z';       { ZIP }
    exit;
  end;
{$endif}

{$ifdef arc}
  if a[0]=#$1A then
  begin
    IdentifyArchive:='C';       { ARC }
    exit;
  end;
{$endif}

{$ifdef zoo}
  if (a[0]='Z') and (a[1]='O') and (a[2]='O') then
  begin
    IdentifyArchive:='O';       { ZOO }
    exit;
  end;
{$endif}

{$ifdef lzh}
  { LZH is recognised by file extension only; compare the last four }
  { characters so that dots elsewhere in the path do not interfere  }
  s:=Name;
  for bc:=1 to length(s) do
    s[bc]:=upcase(s[bc]);
  if (length(s)>=4) and (copy(s,length(s)-3,4)='.LZH') then
  begin
    IdentifyArchive:='L';       { LZH }
    exit;
  end;
{$endif}

  IdentifyArchive:='?';
end;

2  08-24-94 13:21  ALL  KAI ROHRBACHER  Arithmetic compression  SWAG9408

{
Hello Thomas,

On 26.06.94 you wrote in area PASCAL to subject "Arithmetic compression":
TW> But where can we get a discription of this compression method ??
  Michael Barnsley, Lyman Hurd, "Fractal Image Compression", AK Peters,
  1993
  Mark Nelson, "The Data Compression Book", M&T Books, 1991
  Ian Witten, Radford Neal, John Cleary, "Arithmetic Coding for Data
  Compression", CACM, Vol. 30, No. 6, 1987

  Below is a small source from the 1st book, translated into Pascal and
  adapted to work on the uppercase alphabet to demonstrate the basic
  principles.
  For a simple explanation, the program uses the letters of the input
  string to "drive" the starting point through the real interval 0.0 ..
  1.0
  By this process, every possible input string stops at a unique point,
  that is: a point (better: a small interval section) represents the
  whole string. To _decode_ it, you have to reverse the process: you
  start at the given end point and apply the reverse transformation,
  noting which intervals you touch along the way.
  Because any computer language has limited arithmetic resolution, the
  maximum length of a string is limited, too (try it out with
  TYPE REAL=EXTENDED, for example); the limit is reached when the value
  "underflows" the computer's precision. }

{$A+,B-,D+,E+,F-,G-,I+,L+,N+,O-,P+,Q-,R+,S+,T-,V+,X+,Y+}
{$M 16384,0,655360}
PROGRAM arithmeticCompression;
USES CRT;
CONST charSet:STRING='ABCDEFGHIJKLMNOPQRSTUVWXYZ ';
      size=27; {=Length(charSet)}
      p:ARRAY[1..size] OF REAL= (* found empirically *)
       (
        6.1858296469E-02,
        1.1055412402E-02,
        2.6991022453E-02,
        2.6030374520E-02,
        9.2418577127E-02,
        2.1864028512E-02,
        1.4977615842E-02,
        2.8410764564E-02,
        5.5247871050E-02,
        1.3985123226E-03,
        3.8001321554E-03,
        3.2593032914E-02,
        2.1919756707E-02,
        5.2434924064E-02,
        5.7837905257E-02,
        2.0364674693E-02,
        1.0031075103E-03,
        4.9730779744E-02,
        4.8056280170E-02,
        7.2072478498E-02,
        2.0948493879E-02,
        8.2477728625E-03,
        1.0299101184E-02,
        4.7873173243E-03,
        1.3613601926E-02,
        2.7067980437E-03,
        2.3933136781E-01
       );
VAR psum:ARRAY[1..size] OF REAL;

 FUNCTION Encode(CONST s:STRING):REAL;
 VAR i,po:INTEGER;
     offset,len:REAL;
 BEGIN
  offset:=0.0;
  len:=1.0;
  FOR i:=1 TO Length(s) DO
   BEGIN
    po:=POS(s[i],charSet);
    IF po<>0
     THEN BEGIN
           { narrow the current interval to the sub-interval of s[i] }
           offset:=offset+len*psum[po];
           len:=len*p[po]
          END
     ELSE BEGIN
           WRITELN('only input chars ',charSet,' allowed!');
           Halt(1)
          END;
   END;
  Encode:=offset+len/2;   { midpoint of the final interval }
 END;

 FUNCTION Decode(x:REAL; n:BYTE):STRING;
 VAR i,j:INTEGER;
     s:STRING;
 BEGIN
  IF (x<0.0) OR (x>1.0)
   THEN BEGIN
         WRITELN('must lie in the range [0..1]');
         Halt(1)
        END;
  FOR i:=1 TO n DO
   BEGIN
    { find the sub-interval that contains x }
    j:=size;
    WHILE x<psum[j] DO DEC(j);
    s[i]:=charSet[j];
    { rescale x back to the unit interval }
    x:=x-psum[j];
    x:=x/p[j];
   END;
  s[0]:=CHR(n);
  Decode:=s
 END;

CONST
 inp='ARITHMETIC';
VAR
 r:REAL;
 i,j:INTEGER;

BEGIN

 { psum[i] = sum of the probabilities of all symbols before i }
 FOR i:=1 TO size DO
  BEGIN
   psum[i]:=0.0;
   FOR j:=1 TO i-1 DO
    psum[i]:=psum[i]+p[j];
  END;

 ClrScr;
 WRITELN('encoding string : ',inp);
 r:=Encode(inp);
 WRITELN('string is encoded by ',r);
 WRITELN('decoding of r gives: ',Decode(r,Length(inp)));

END.

3  08-24-94 17:57  ALL  PHIL KATZ  Zip File Format  SWAG9408

System of Origin : IBM

Original author : Phil Katz

FILE FORMAT
-----------

Files stored in arbitrary order. Large zipfiles can span multiple
diskette media.

        Local File Header 1
        file 1 extra field
        file 1 comment
        file data 1
        Local File Header 2
        file 2 extra field
        file 2 comment
        file data 2
        .
        .
        .
        Local File Header n
        file n extra field
        file n comment
        file data n
        Central Directory
        central extra field
        central comment
        End of Central Directory
        end comment
EOF


LOCAL FILE HEADER
-----------------

OFFSET LABEL       TYP  VALUE        DESCRIPTION
------ ----------- ---- -----------  ----------------------------------
00     ZIPLOCSIG   HEX  04034B50     ;Local File Header Signature
04     ZIPVER      DW   0000         ;Version needed to extract
06     ZIPGENFLG   DW   0000         ;General purpose bit flag
08     ZIPMTHD     DW   0000         ;Compression method
0A     ZIPTIME     DW   0000         ;Last mod file time (MS-DOS)
0C     ZIPDATE     DW   0000         ;Last mod file date (MS-DOS)
0E     ZIPCRC      HEX  00000000     ;CRC-32
12     ZIPSIZE     HEX  00000000     ;Compressed size
16     ZIPUNCMP    HEX  00000000     ;Uncompressed size
1A     ZIPFNLN     DW   0000         ;Filename length
1C     ZIPXTRALN   DW   0000         ;Extra field length
1E     ZIPNAME     DS   ZIPFNLN      ;filename
--     ZIPXTRA     DS   ZIPXTRALN    ;extra field

CENTRAL DIRECTORY STRUCTURE
---------------------------

OFFSET LABEL       TYP  VALUE        DESCRIPTION
------ ----------- ---- -----------  ----------------------------------
00     ZIPCENSIG   HEX  02014B50     ;Central file header signature
04     ZIPCVER     DB   00           ;Version made by
05     ZIPCOS      DB   00           ;Host operating system
06     ZIPCVXT     DB   00           ;Version needed to extract
07     ZIPCEXOS    DB   00           ;O/S of version needed for extraction
08     ZIPCFLG     DW   0000         ;General purpose bit flag
0A     ZIPCMTHD    DW   0000         ;Compression method
0C     ZIPCTIM     DW   0000         ;Last mod file time (MS-DOS)
0E     ZIPCDAT     DW   0000         ;Last mod file date (MS-DOS)
10     ZIPCCRC     HEX  00000000     ;CRC-32
14     ZIPCSIZ     HEX  00000000     ;Compressed size
18     ZIPCUNC     HEX  00000000     ;Uncompressed size
1C     ZIPCFNL     DW   0000         ;Filename length
1E     ZIPCXTL     DW   0000         ;Extra field length
20     ZIPCCML     DW   0000         ;File comment length
22     ZIPDSK      DW   0000         ;Disk number start
24     ZIPINT      DW   0000         ;Internal file attributes

       LABEL        BIT        DESCRIPTION
       -----------  ---------  -----------------------------------------
       ZIPINT       0          if = 1, file is apparently an ASCII or
                               text file
                    0          if = 0, file apparently contains binary
                               data

                    1-7        unused in version 1.0.

26     ZIPEXT      HEX  00000000     ;External file attributes, host
                                     ;system dependent
2A     ZIPOFST     HEX  00000000     ;Relative offset of local header
                                     ;from the start of the first disk
                                     ;on which this file appears
2E     ZIPCFN      DS   ZIPCFNL      ;Filename or path - should not
                                     ;contain a drive or device letter,
                                     ;or a leading slash. All slashes
                                     ;should be forward slashes '/'
--     ZIPCXTR     DS   ZIPCXTL      ;extra field
--     ZIPCOM      DS   ZIPCCML      ;file comment


END OF CENTRAL DIR STRUCTURE
----------------------------

OFFSET LABEL       TYP  VALUE        DESCRIPTION
------ ----------- ---- -----------  ----------------------------------
00     ZIPESIG     HEX  06054B50     ;End of central dir signature
04     ZIPEDSK     DW   0000         ;Number of this disk
06     ZIPECEN     DW   0000         ;Number of disk with start central dir
08     ZIPENUM     DW   0000         ;Total number of entries in central dir
                                     ;on this disk
0A     ZIPECENN    DW   0000         ;total number entries in central dir
0C     ZIPECSZ     HEX  00000000     ;Size of the central directory
10     ZIPEOFST    HEX  00000000     ;Offset of start of central directory
                                     ;with respect to the starting disk
                                     ;number
14     ZIPECOML    DW   0000         ;zipfile comment length
16     ZIPECOM     DS   ZIPECOML     ;zipfile comment


ZIP VALUES LEGEND
-----------------

 HOST O/S

 VALUE  DESCRIPTION                  VALUE  DESCRIPTION
 -----  --------------------------   -----  --------------------------
   0    MS-DOS and OS/2 (FAT)          5    Atari ST
   1    Amiga                          6    OS/2 1.2 extended file sys
   2    VMS                            7    Macintosh
   3    *nix                           8 thru
   4    VM/CMS                         255  unused


 GENERAL PURPOSE BIT FLAG

 LABEL        BIT        DESCRIPTION
 -----------  ---------  -----------------------------------------
 ZIPGENFLG    0          If set, file is encrypted
   or         1          If file Imploded and this bit is set, 8K
 ZIPCFLG                 sliding dictionary was used. If clear, 4K
                         sliding dictionary was used.
              2          If file Imploded and this bit is set, 3
                         Shannon-Fano trees were used. If clear, 2
                         Shannon-Fano trees were used.
              3-4        unused
              5-7        used internally by ZIP

 Note: Bits 1 and 2 are undefined if the compression method is
       other than type 6 (Imploding).
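
 As an illustration of the bit flag table above, here is a minimal sketch
 in Pascal that tests bits 0, 1 and 2 of a flag word. It is not taken from
 the original post: the names IsEncrypted, DictionaryKB, TreeCount,
 DescribeFlags and cmImploded are made up for this example, and the flag
 values passed in the main block are arbitrary test inputs.

```pascal
program GPFlagDemo;
{ Sketch only: decodes bits 0-2 of ZIPGENFLG / ZIPCFLG as tabulated above. }
{ All identifiers here are illustrative, not part of any ZIP library.      }

const
  cmImploded = 6;          { compression method 6 = Imploding }

function IsEncrypted(flags: word): boolean;
begin
  IsEncrypted := (flags and $0001) <> 0;      { bit 0 }
end;

function DictionaryKB(flags: word): integer;
begin
  if (flags and $0002) <> 0 then              { bit 1 }
    DictionaryKB := 8
  else
    DictionaryKB := 4;
end;

function TreeCount(flags: word): integer;
begin
  if (flags and $0004) <> 0 then              { bit 2 }
    TreeCount := 3
  else
    TreeCount := 2;
end;

procedure DescribeFlags(flags, method: word);
begin
  writeln('encrypted : ', IsEncrypted(flags));
  if method = cmImploded then
  begin
    writeln('dictionary: ', DictionaryKB(flags), 'K');
    writeln('trees     : ', TreeCount(flags));
  end
  else
    { bits 1 and 2 carry no meaning for other methods }
    writeln('bits 1 and 2 are undefined for method ', method);
end;

begin
  DescribeFlags($0006, cmImploded);   { bits 1 and 2 set, bit 0 clear }
  DescribeFlags($0001, 0);            { Stored, bit 0 set }
end.
```

 The check on the compression method comes first because, per the note
 above, bits 1 and 2 only have a defined meaning for Imploded files.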


 COMPRESSION METHOD

 NAME         METHOD  DESCRIPTION
 -----------  ------  --------------------------------------------
 Stored       0       No compression used
 Shrunk       1       LZW, 8K buffer, 9-13 bits with partial clearing
 Reduced-1    2       Probabilistic compression, L(X) = lower 7 bits
 Reduced-2    3       Probabilistic compression, L(X) = lower 6 bits
 Reduced-3    4       Probabilistic compression, L(X) = lower 5 bits
 Reduced-4    5       Probabilistic compression, L(X) = lower 4 bits
 Imploded     6       2 Shannon-Fano trees, 4K sliding dictionary
 Imploded     7       3 Shannon-Fano trees, 4K sliding dictionary
 Imploded     8       2 Shannon-Fano trees, 8K sliding dictionary
 Imploded     9       3 Shannon-Fano trees, 8K sliding dictionary


 EXTRA FIELD

 OFFSET  LABEL        TYP  VALUE       DESCRIPTION
 ------  -----------  ---- ----------  ----------------------------
 00      EX1ID        DW   0000        ;0-31 reserved by PKWARE
 02      EX1LN        DW   0000
 04      EX1DAT       DS   EX1LN       ;Specific data for individual
 .                                     ;files. Data field should begin
 .                                     ;with a s/w specific unique ID
 EX1LN+4
         EXnID        DW   0000
         EXnLN        DW   0000

         EXnDAT       DS   EXnLN       ;entire header may not exceed 64k
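
 The ID / length / data layout of the extra field can be walked with a
 short loop. The following Pascal sketch is an illustration only, not
 code from the original post: TBuf, Sample, WordAt and CountBlocks are
 hypothetical names, and the sample bytes are invented (two blocks with
 IDs $0040 and $0041, chosen outside the range 0-31 reserved by PKWARE).
 It assumes DW fields are stored low byte first, as in the MS-DOS
 origin of the format.

```pascal
program ExtraFieldWalk;
{ Sketch only: walks the EXnID / EXnLN / EXnDAT blocks of an extra field. }
{ The buffer contents are made up for illustration.                       }

type
  TBuf = array[0..9] of byte;

const
  { block 1: ID $0040, 2 data bytes; block 2: ID $0041, 0 data bytes }
  Sample: TBuf = ($40,$00, $02,$00, $AA,$BB,
                  $41,$00, $00,$00);

{ read a DW field stored low byte first }
function WordAt(const buf: TBuf; ofs: longint): word;
begin
  WordAt := buf[ofs] or (word(buf[ofs+1]) shl 8);
end;

{ print every complete block and return how many were found }
function CountBlocks(const buf: TBuf; len: longint): integer;
var
  ofs, dlen: longint;
  id: word;
  n: integer;
begin
  n := 0;
  ofs := 0;
  while ofs + 4 <= len do               { room for ID and length }
  begin
    id   := WordAt(buf, ofs);           { EXnID }
    dlen := WordAt(buf, ofs + 2);       { EXnLN }
    if ofs + 4 + dlen > len then
      break;                            { data would run past the field }
    writeln('ID=', id, ' length=', dlen);
    inc(n);
    ofs := ofs + 4 + dlen;              { next block starts at EXnLN+4 }
  end;
  CountBlocks := n;
end;

var
  blocks: integer;

begin
  blocks := CountBlocks(Sample, SizeOf(Sample));
  writeln('blocks: ', blocks);
end.
```

 For the sample buffer the loop visits two blocks. The bounds test
 before each step is a design choice of this sketch, so that a
 truncated field ends the walk instead of reading past the buffer.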